strategies: Seq[GenericStrategy[PhysicalPlan]]
QueryPlanner: From Logical to Physical Plans
QueryPlanner transforms a logical query plan into a physical execution plan by passing it through a chain of GenericStrategy objects. The physical plan is a SparkPlan for both SparkPlanner and the Hive-specific SparkPlanner.
QueryPlanner Contract
The QueryPlanner contract defines the following operations:
- Abstract strategies
- Concrete plan
- Protected collectPlaceholders
- Protected prunePlans
Note: The protected collectPlaceholders and prunePlans methods are meant to be defined by subclasses and are used by the concrete plan method.
strategies Method
The abstract strategies method returns a collection of GenericStrategy objects (that are used in the plan method).
plan Method
plan(plan: LogicalPlan): Iterator[PhysicalPlan]
plan returns an Iterator[PhysicalPlan] whose elements are the results of applying each GenericStrategy object from the strategies collection to the input plan parameter.
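The way plan chains strategies can be illustrated with a minimal, self-contained sketch. It does not use Spark's classes: LogicalPlan, PhysicalPlan, Scan, Limit, ScanExec, LimitExec, ScanStrategy, LimitStrategy and ToyPlanner are all hypothetical stand-ins, and the sketch leaves out the placeholder handling and pruning that the protected methods of the contract are for.

```scala
object PlanSketch {
  // Stand-in logical operators (assumptions, not Spark's LogicalPlan hierarchy).
  sealed trait LogicalPlan
  case class Scan(table: String) extends LogicalPlan
  case class Limit(n: Int, child: LogicalPlan) extends LogicalPlan

  // Stand-in physical operators (assumptions, not Spark's SparkPlan hierarchy).
  sealed trait PhysicalPlan
  case class ScanExec(table: String) extends PhysicalPlan
  case class LimitExec(n: Int, child: PhysicalPlan) extends PhysicalPlan

  // A strategy returns zero or more candidate physical plans for a logical plan.
  abstract class GenericStrategy[P] {
    def apply(plan: LogicalPlan): Seq[P]
  }

  abstract class QueryPlanner[P] {
    // Abstract: subclasses decide which strategies take part in planning.
    def strategies: Seq[GenericStrategy[P]]

    // Concrete: apply every strategy in order and lazily concatenate the candidates.
    def plan(plan: LogicalPlan): Iterator[P] =
      strategies.iterator.flatMap(strategy => strategy(plan))
  }

  // Only knows how to plan a bare table scan.
  object ScanStrategy extends GenericStrategy[PhysicalPlan] {
    def apply(plan: LogicalPlan): Seq[PhysicalPlan] = plan match {
      case Scan(table) => Seq(ScanExec(table))
      case _           => Nil
    }
  }

  // Only knows how to plan a limit directly over a table scan.
  object LimitStrategy extends GenericStrategy[PhysicalPlan] {
    def apply(plan: LogicalPlan): Seq[PhysicalPlan] = plan match {
      case Limit(n, Scan(table)) => Seq(LimitExec(n, ScanExec(table)))
      case _                     => Nil
    }
  }

  object ToyPlanner extends QueryPlanner[PhysicalPlan] {
    val strategies: Seq[GenericStrategy[PhysicalPlan]] = Seq(LimitStrategy, ScanStrategy)
  }
}
```

A strategy that does not recognize the logical plan returns an empty sequence, so it contributes nothing to the resulting iterator. Calling PlanSketch.ToyPlanner.plan(PlanSketch.Limit(5, PlanSketch.Scan("t"))) therefore yields only the candidate produced by LimitStrategy.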
collectPlaceholders Method
collectPlaceholders(plan: PhysicalPlan): Seq[(PhysicalPlan, LogicalPlan)]
collectPlaceholders returns a collection of pairs, each made of a physical plan and its corresponding logical plan.
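One possible reading of the collectPlaceholders signature is sketched below. It assumes that a physical plan tree can contain marker nodes standing for logical subtrees that still have to be planned. The Placeholder node, the other plan classes and the tree walk are hypothetical and only illustrate the shape of the result, not how any concrete planner implements it.

```scala
object PlaceholderSketch {
  // Stand-in logical plan (assumption, not Spark's LogicalPlan).
  sealed trait LogicalPlan
  case class Scan(table: String) extends LogicalPlan

  // Stand-in physical plan tree (assumption, not Spark's SparkPlan).
  sealed trait PhysicalPlan {
    def children: Seq[PhysicalPlan]
  }
  case class ScanExec(table: String) extends PhysicalPlan {
    def children: Seq[PhysicalPlan] = Nil
  }
  case class FilterExec(condition: String, child: PhysicalPlan) extends PhysicalPlan {
    def children: Seq[PhysicalPlan] = Seq(child)
  }
  // Hypothetical marker node: "this logical subtree has not been planned yet".
  case class Placeholder(logical: LogicalPlan) extends PhysicalPlan {
    def children: Seq[PhysicalPlan] = Nil
  }

  // Walk the physical tree and pair every marker node with the logical plan it stands for.
  def collectPlaceholders(plan: PhysicalPlan): Seq[(PhysicalPlan, LogicalPlan)] = plan match {
    case p @ Placeholder(logical) => Seq(p -> logical)
    case other                    => other.children.flatMap(c => collectPlaceholders(c))
  }
}
```

For a tree such as FilterExec("x > 1", Placeholder(Scan("t"))) the sketch returns a single pair, while a tree without marker nodes gives an empty collection.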
SparkStrategies: Container of SparkStrategy Strategies
SparkStrategies is an abstract base QueryPlanner (of SparkPlan) that serves as a "container" (or a namespace) of the concrete SparkStrategy objects:
- SpecialLimits
- StatefulAggregationStrategy
- Aggregation
- InMemoryScans
- StreamingRelationStrategy
Note: Strategy is a type alias of SparkStrategy that is defined in the org.apache.spark.sql package object.
Note: SparkPlanner is the one and only concrete implementation of SparkStrategies.
Caution: FIXME What is singleRowRdd for?
Hive-Specific SparkPlanner for HiveSessionState
The HiveSessionState class uses a custom anonymous SparkPlanner for the planner method (part of the SessionState contract).
The custom anonymous SparkPlanner uses Strategy objects defined in HiveStrategies.
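The pattern of an anonymous planner that mixes in a trait of extra strategies can be sketched without Spark as follows. BasePlanner, ExtraStrategies, TableScans and the String-based plan representation are hypothetical stand-ins chosen to keep the example small. The sketch shows only the Scala mechanics and makes no claim about which strategies the real planner uses or in what order.

```scala
object HivePlannerSketch {
  // Stand-in strategy type: logical plan text => candidate physical plans (assumption).
  type Strategy = String => Seq[String]

  // Stand-in for a general-purpose planner with one default strategy.
  class BasePlanner {
    def strategies: Seq[Strategy] =
      Seq((plan: String) => Seq(s"DefaultExec($plan)"))

    def plan(logical: String): Iterator[String] =
      strategies.iterator.flatMap(strategy => strategy(logical))
  }

  // Namespace-style trait that only holds additional strategy objects.
  trait ExtraStrategies {
    val TableScans: Strategy =
      (plan: String) => if (plan.startsWith("hive:")) Seq(s"HiveScanExec($plan)") else Nil
  }

  // Anonymous subclass: mixes in the extra strategies and overrides the strategy list.
  val planner: BasePlanner = new BasePlanner with ExtraStrategies {
    override def strategies: Seq[Strategy] = TableScans +: super.strategies
  }
}
```

With this setup, HivePlannerSketch.planner.plan("hive:t") tries the mixed-in TableScans strategy before the default one, while plans that TableScans does not recognize fall through to the default strategy only.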